xAI launched the Grok Voice Agent API, priced at $0.05 per minute, which makes it highly cost-effective. The model ranks first on audio reasoning benchmarks, with a time to first audio of under 1 second and a response speed nearly five times faster than competitors. It automatically detects and switches among dozens of languages, including Chinese, and integrates real-time web search and reasoning to improve response quality.
Meta acquires AI wearable company Limitless, integrating its smart pendant for voice interaction, real-time transcription, and search to boost focus and memory. The team joins Meta's next-gen AI hardware development, following strategic adjustments.....
Google tests merging 'AI Overview' and 'AI Mode' on mobile, enabling multi-turn conversations directly on search results without page jumps. It supports text, voice, and image inputs, with conversations up to three times longer than traditional searches, while retaining source citations and web rankings. The VP of Product states the redesign aims to eliminate user choice costs between search and chat, facilitating continuous queries and instant r....
Google Assistant to be discontinued by March 31, 2026, with Gemini taking over core platforms. Key milestones: Gemini defaults for voice search in Dec 2024, full Nest integration by June 2025. Transition period until Q1 2026.....
An advanced version of ChatGPT with folders, search, GPT Store, image library, voice GPT, export options, custom prompts, prompt chains, and hidden models.
Voice AI Search Extension
Powerful speech-to-text API
Anthropic: $105 input tokens/M, $525 output tokens/M, 200 context length
Google: $0.7, $2.8, 1k
Alibaba: -, $3.9, $15.2, 64
Bytedance: $0.8, $2, 128, $54, $163
Tencent: $1, $4, 32
Baidu: $8, 256
Chatglm: $16, $2.4, $12, 8
Xai: $21
Deepseek: $3, $9
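As a rough illustration of how per-million-token prices turn into the cost of a single request, the sketch below uses the Anthropic figures from the listing ($105 input, $525 output per million tokens). The function name and the token counts are hypothetical, and the currency is whatever the listing itself uses.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Return the cost of one request given per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


# Hypothetical request: 2,000 input tokens and 500 output tokens.
cost = request_cost(2_000, 500, input_price_per_m=105, output_price_per_m=525)
print(f"{cost:.4f}")
```

Because output tokens are priced higher than input tokens in this listing, the 500 output tokens here cost more than the 2,000 input tokens.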
A Gmail management server based on the Model Context Protocol (MCP) that lets an AI agent search, read, delete, and send emails. It must be used together with a voice interaction client.
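For context, MCP clients invoke a server's tools with JSON-RPC 2.0 requests using the `tools/call` method. The following is a minimal sketch of building such a request. The tool name `search_emails` and its `query` argument are assumptions for illustration only; the names this server actually exposes may differ and would normally be discovered by listing its tools first.

```python
import json
from itertools import count

_request_ids = count(1)  # simple incrementing JSON-RPC request ids


def make_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and argument; not taken from the server's documentation.
request = make_tool_call("search_emails", {"query": "is:unread"})
print(json.dumps(request, indent=2))
```

A voice client would typically transcribe the spoken command, let the agent choose a tool and its arguments, and then send a request of this shape over the MCP transport.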